@SergeantCooper

Solves #114339

Overrides the isTruncateFree and isZExtFree virtual functions in the NVPTXTargetLowering class so that truncation from i64 to i32 and zero-extension from i32 to i64 are reported as free on NVPTX. These hooks do not change how the conversions are lowered; they tell the code generator that it may treat them as costing nothing when it decides whether a transformation is profitable.
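
One way to picture why these two conversions can be considered free is to model a 64-bit value as a pair of 32-bit halves. The sketch below is illustrative only: RegPair, split, join, truncToI32 and zextToI64 are hypothetical names, and the register-pair view is an assumption about the hardware, not something this PR states or implements.

```cpp
#include <cstdint>

// Hypothetical model (an assumption, not part of the PR): a 64-bit value
// held as two 32-bit halves.
struct RegPair {
  std::uint32_t lo;
  std::uint32_t hi;
};

// Split a 64-bit value into its low and high 32-bit halves.
constexpr RegPair split(std::uint64_t v) {
  return {static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32)};
}

// Reassemble the two halves into a 64-bit value.
constexpr std::uint64_t join(RegPair p) {
  return (static_cast<std::uint64_t>(p.hi) << 32) | p.lo;
}

// Truncation i64 -> i32 only selects the low half; no arithmetic is needed.
constexpr std::uint32_t truncToI32(RegPair p) { return p.lo; }

// Zero-extension i32 -> i64 reuses the value as the low half and supplies a
// constant zero for the high half.
constexpr RegPair zextToI64(std::uint32_t v) { return {v, 0u}; }

static_assert(truncToI32(split(0x1122334455667788ULL)) == 0x55667788u,
              "truncation keeps the low 32 bits");
static_assert(join(zextToI64(0xFFFFFFFFu)) == 0x00000000FFFFFFFFULL,
              "zero-extension clears the high 32 bits");
```

Under that assumption, neither conversion has to compute anything on the existing bits, which is the intuition behind reporting them as free.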

Details:

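  • isTruncateFree returns true only for truncation from i64 to i32; every other type pair, including non-simple value types, returns false.
  • isZExtFree returns true only for zero-extension from i32 to i64, in both the EVT overload and the Type * overload.
  • Reporting these conversions as free lets the code generator stop counting them as a cost when it weighs one transformation against another.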
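
The decision logic of the two hooks can be sketched outside LLVM as a pair of plain predicates. This is a minimal stand-alone model, not the code from the patch: SimpleTy is a hypothetical stand-in for the few value types that matter here, and Other stands for anything else, including non-simple types.

```cpp
// Hypothetical stand-in for the value types relevant to this PR.
// `Other` covers every remaining case, including non-simple types.
enum class SimpleTy { i8, i16, i32, i64, Other };

// Mirrors the described behavior: only i64 -> i32 truncation is free.
constexpr bool isTruncateFree(SimpleTy from, SimpleTy to) {
  return from == SimpleTy::i64 && to == SimpleTy::i32;
}

// Mirrors the described behavior: only i32 -> i64 zero-extension is free.
constexpr bool isZExtFree(SimpleTy from, SimpleTy to) {
  return from == SimpleTy::i32 && to == SimpleTy::i64;
}

static_assert(isTruncateFree(SimpleTy::i64, SimpleTy::i32),
              "i64 -> i32 truncation is reported as free");
static_assert(!isTruncateFree(SimpleTy::i32, SimpleTy::i16),
              "other truncations are not reported as free");
static_assert(isZExtFree(SimpleTy::i32, SimpleTy::i64),
              "i32 -> i64 zero-extension is reported as free");
static_assert(!isZExtFree(SimpleTy::i16, SimpleTy::i32),
              "other zero-extensions are not reported as free");
```

Both predicates are deliberately narrow: they answer true for exactly one type pair each and false for everything else, which matches the bullets above.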

Testing:

  • Added a lit test, llvm/test/CodeGen/NVPTX/truncate_zext.ll, that runs llc -march=nvptx64 on one i64-to-i32 truncation and one i32-to-i64 zero-extension.

@github-actions

github-actions bot commented Nov 2, 2024

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by pinging the PR: add a comment that says "Ping". The common courtesy ping rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot

llvmbot commented Nov 2, 2024

@llvm/pr-subscribers-backend-nvptx

Author: None (Quark-69)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/114658.diff

3 Files Affected:

  • (modified) llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp (+23)
  • (modified) llvm/lib/Target/NVPTX/NVPTXISelLowering.h (+4)
  • (added) llvm/test/CodeGen/NVPTX/truncate_zext.ll (+17)
diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index d3bf0ecfe2cc92..b5fc975d71dfa8 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -3340,6 +3340,29 @@ bool NVPTXTargetLowering::splitValueIntoRegisterParts(
   return false;
 }
 
+bool llvm::NVPTXTargetLowering::isTruncateFree(EVT FromVT, EVT ToVT) const {
+
+  if (!FromVT.isSimple() || !ToVT.isSimple()) {
+    return false;
+  }
+
+  return (FromVT.getSimpleVT() == MVT::i64 && ToVT.getSimpleVT() == MVT::i32);
+}
+
+bool llvm::NVPTXTargetLowering::isZExtFree(EVT FromVT, EVT ToVT) const {
+  if (!FromVT.isSimple() || !ToVT.isSimple()) {
+    return false;
+  }
+  return (FromVT.getSimpleVT() == MVT::i32 && ToVT.getSimpleVT() == MVT::i64);
+}
+
+bool llvm::NVPTXTargetLowering::isZExtFree(Type *SrcTy, Type *DstTy) const {
+  if (!SrcTy->isIntegerTy() || !DstTy->isIntegerTy())
+    return false;
+  return SrcTy->getPrimitiveSizeInBits() == 32 &&
+         DstTy->getPrimitiveSizeInBits() == 64;
+}
+
 // This creates target external symbol for a function parameter.
 // Name of the symbol is composed from its index and the function name.
 // Negative index corresponds to special parameter (unsized array) used for
diff --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.h b/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
index c8b589ae39413e..fa73938a35a168 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.h
@@ -616,6 +616,10 @@ class NVPTXTargetLowering : public TargetLowering {
     return true;
   }
 
+  bool isTruncateFree(EVT FromVT, EVT ToVT) const override;
+  bool isZExtFree(EVT FromVT, EVT ToVT) const override;
+  bool isZExtFree(Type *SrcTy, Type *DstTy) const override;
+
 private:
   const NVPTXSubtarget &STI; // cache the subtarget here
   SDValue getParamSymbol(SelectionDAG &DAG, int idx, EVT) const;
diff --git a/llvm/test/CodeGen/NVPTX/truncate_zext.ll b/llvm/test/CodeGen/NVPTX/truncate_zext.ll
new file mode 100644
index 00000000000000..decc02c5840491
--- /dev/null
+++ b/llvm/test/CodeGen/NVPTX/truncate_zext.ll
@@ -0,0 +1,17 @@
+; RUN: llc -march=nvptx64 < %s | FileCheck %s
+
+; Test for truncation from i64 to i32
+define i32 @test_trunc_i64_to_i32(i64 %val) {
+  ; CHECK-LABEL: test_trunc_i64_to_i32
+  ; CHECK: trunc
+  %trunc = trunc i64 %val to i32
+  ret i32 %trunc
+}
+
+; Test for zero-extension from i32 to i64
+define i64 @test_zext_i32_to_i64(i32 %val) {
+  ; CHECK-LABEL: test_zext_i32_to_i64
+  ; CHECK: zext
+  %zext = zext i32 %val to i64
+  ret i64 %zext
+}
\ No newline at end of file
